Graph Attention Transformer Network for Multi-label Image Classification
نویسندگان
چکیده
Multi-label classification aims to recognize multiple objects or attributes from images. The key solving this issue relies on effectively characterizing the inter-label correlations dependencies, which bring prevailing graph neural network. However, current methods often use co-occurrence probability of labels based training set as adjacency matrix model correlation, is greatly limited by dataset and affects model’s generalization ability. This article proposes a Graph Attention Transformer Network, general framework for multi-label image mining rich effective label correlation. First, we cosine similarity value pre-trained word embedding initial correlation matrix, can represent richer semantic information than one. Subsequently, propose attention transformer layer transfer adapt domain. Our extensive experiments have demonstrated that our proposed achieve highly competitive performance three datasets.
منابع مشابه
Matrix Completion for Multi-label Image Classification
Recently, image categorization has been an active research topic due to the urgent need to retrieve and browse digital images via semantic keywords. This paper formulates image categorization as a multi-label classification problem using recent advances in matrix completion. Under this setting, classification of testing data is posed as a problem of completing unknown label entries on a data ma...
متن کاملOrder-Free RNN with Visual Attention for Multi-Label Classification
We propose a recurrent neural network (RNN) based model for image multi-label classification. Our model uniquely integrates and learning of visual attention and Long Short Term Memory (LSTM) layers, which jointly learns the labels of interest and their co-occurrences, while the associated image regions are visually attended. Different from existing approaches utilize either model in their netwo...
متن کاملProximity-based Graph Embeddings for Multi-label Classification
In many real applications of text mining, information retrieval and natural language processing, large-scale features are frequently used, which often make the employed machine learning algorithms intractable, leading to the well-known problem “curse of dimensionality”. Aiming at not only removing the redundant information from the original features but also improving their discriminating abili...
متن کاملMulti-label Image Classification with A Probabilistic Label Enhancement Model
In this paper, we present a novel probabilistic label enhancement model to tackle multi-label image classification problem. Recognizing multiple objects in images is a challenging problem due to label sparsity, appearance variations of the objects and occlusions. We propose to tackle these difficulties from a novel perspective by constructing auxiliary labels in the output space. Our idea is to...
متن کاملBelief Theory for Large-Scale Multi-label Image Classification
Classifier combination is known to generally perform better than each individual classifier by taking into account the complementarity between the input pieces of information. Dempster-Shafer theory is a framework of interest to make such a fusion at the decision level, and allows in addition to handle the conflict that can exist between the classifiers as well as the uncertainty that remains o...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: ACM Transactions on Multimedia Computing, Communications, and Applications
سال: 2023
ISSN: ['1551-6857', '1551-6865']
DOI: https://doi.org/10.1145/3578518